Most Expensive Neighborhoods and Areas in Berlin

Dr. Lei Zhang

Berlin: arm aber sexy

Introduction: Business Problem

There is a saying about Berlin: "arm aber sexy" ("poor but sexy"). "Arm" is the German adjective for "poor". Why is Berlin "arm"? Berlin was once a very rich city, but as the capital of Germany it was far more deeply involved in German history than any other German city, and it became poor as a result: it was occupied and divided. Since the fall of the Berlin Wall, however, Berlin has been recovering very quickly. Today Berlin, the largest city in Germany with more than 3.7 million residents, a city full of history and an international, multicultural metropolis, has become one of the sexiest cities in Germany.

In this project we will try to find out which parts of Berlin will have the highest rents in the next few years. Specifically, this report is aimed at stakeholders interested in investing in an apartment building in Berlin, Germany.

We will try to detect

We will use our data to identify a few of the most promising neighborhoods and areas based on these criteria. A ranked list of the regions will be provided and their prospects analysed, so that stakeholders can choose the best possible final location.

Data

Based on the definition of our problem, the factors that will influence our decision are:

While it is easy to find out the average rent for each borough, it is not easy to find out the average rent of a specific area, e.g. the average rent within a radius of 500 m from Berlin-Alexanderplatz. Also it is relatively easy to collect data showing the growth of the average rent of each borough over the years, but it is not easy to get those data for each neighborhood/area. We are going to analyse the collected data for Berlin boroughs and some indirectly related data for the neighborhoods and areas to estimate the more specific rents and their growth.

The following data sources will be needed to extract/generate the required information:

General Information about Boroughs and Neighborhoods of Berlin

We'll use web scraping to get some general information about the boroughs and neighborhoods of Berlin from the Wikipedia page Verwaltungsgliederung Berlins. Here is the table we get:

We then manually enter the average rent of each Berlin borough from 2009 to 2020, taken from the PDF files on the Berliner Wohnungsmarktbericht website.

Let's preview the data.

The green line is the average rent for Berlin as a whole. Evidently, three boroughs (Berlin-Mitte, Berlin-Friedrichshain-Kreuzberg and Berlin-Charlottenburg-Wilmersdorf) are above average, and the others are below average.

Then we collect the demographic data about the population of each borough of Berlin from 2010 to 2019 from the same source.

Again, a preview to get a quick impression.

City, Borough, Neighborhood Coordinates

Let's get latitude & longitude coordinates of the center of Berlin and the center of each borough of Berlin using the Nominatim API and store them into a Pandas DataFrame.

Let's then plot these locations on a map to get an initial impression.

The red circle in the middle is the center of Berlin, at the intersection of Friedrichstraße and Unter den Linden. The blue circles are the centers of the boroughs.

This extends our data:

Now, let's get the coordinates for the neighborhoods. We will only look at the neighborhoods we are interested in, namely those whose boroughs have an above-average rent.

Again, we will plot the neighborhoods on a map to get an impression.

Foursquare

Now that we have our location candidates, let's use the Foursquare API to get information on the venues in each neighborhood. In particular, we are interested in the nearby hotels.

This concludes the data gathering phase - we're now ready to use this data for analysis to produce the report!

Methodology

The goal of this project is to find the neighborhoods and areas of Berlin that will have the highest rents in the next few years. Two factors are therefore extremely important: the current average rent and the expected growth of the rent. From a preliminary look at the data we have collected, it is clear that Berlin-Mitte currently has the highest average rent at 13.70 EUR per square meter, Berlin-Friedrichshain-Kreuzberg comes second with 13.11 EUR, and Berlin-Charlottenburg-Wilmersdorf, with 12.38 EUR, is also above average. As for growth, the line chart suggests that none of the boroughs whose average rents are below the Berlin average is likely to climb to the top. It is therefore reasonable to focus only on the three boroughs mentioned above.

In the first step we will use simple linear regression models to estimate the growth of the average rent of each of the three boroughs and of Berlin as a whole from 2009 to 2020. We visualize the results and put them together for comparison.

The second step of our analysis will be to use simple linear regression models to estimate the growth of the population of each of the three boroughs and Berlin as a whole from 2009 to 2020. We also visualize the results and compare them.

In the third step we will create clusters of locations based on the patterns of their nearby venues (k-means clustering and density-based clustering) for all the neighborhoods in the three boroughs, on the assumption that areas with similar patterns of nearby venues should have similar average rents. We will analyse the similarities within each cluster.

In the fourth step we will use the patterns we discovered to estimate which areas of Berlin should have the highest rents in the next few years.

In the last step we will use the data collected from Berliner Mietspiegel to test our estimation.

Analysis

In this simple linear regression model we use the year as the independent variable to estimate the dependent variable, the average rent of each borough.

Here the coefficients indicate the rate of growth: the higher the number, the faster the rent grows.
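To make the role of the coefficient concrete, here is a minimal sketch of a one-variable least-squares fit in plain Python. The borough names and rent values are placeholders invented for illustration, not the figures used in this report, and `fit_line` is a hypothetical helper rather than the code that produced our results.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder rents in EUR per square meter; NOT the values from the report.
years = [2016, 2017, 2018, 2019, 2020]
example_rents = {
    "Borough A": [10.0, 10.5, 11.0, 11.5, 12.0],
    "Borough B": [9.0, 9.2, 9.4, 9.6, 9.8],
}

# The slope is the estimated rent growth in EUR per square meter per year.
growth_rates = {name: fit_line(years, rents)[0] for name, rents in example_rents.items()}

# Rank the boroughs from fastest to slowest estimated growth.
ranking = sorted(growth_rates, key=growth_rates.get, reverse=True)
```

With these placeholder series, "Borough A" has the larger slope and is ranked first. Only the slope is used for the ranking; the intercept merely positions the fitted line.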

Let's visualize the result.

The models fit the samples quite well and show no sign of overfitting. However, since the values differ only marginally, it is hard to compare them visually, even side by side!

So, let's put them together!

Clearly, Berlin-Mitte has not only the highest rent but also the fastest rent growth! Berlin-Friedrichshain-Kreuzberg is in second place with respect to both current rent and rent growth.

Next, we move to population growth.

Clearly, Berlin-Mitte has the fastest population growth, and Berlin-Charlottenburg-Wilmersdorf is second. But note that population growth is not the same as population density growth, and the latter tells us more about rent growth. Let's also consider the population density growth by dividing the population figures by the area of each borough.
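The following sketch shows why dividing by the area can change a ranking. The two boroughs and all their figures are hypothetical placeholders, and `density_growth` is a helper name introduced only for this example.

```python
# Placeholder figures, NOT taken from the report: yearly population growth
# (residents per year) and area (km^2) for two hypothetical boroughs.
boroughs = {
    "Large borough": {"growth_per_year": 3000.0, "area_km2": 60.0},
    "Small borough": {"growth_per_year": 2000.0, "area_km2": 20.0},
}


def density_growth(growth_per_year, area_km2):
    """Residents per km^2 gained per year."""
    return growth_per_year / area_km2


# Ranking by absolute population growth.
by_population = sorted(
    boroughs, key=lambda b: boroughs[b]["growth_per_year"], reverse=True
)

# Ranking by population density growth.
by_density = sorted(
    boroughs, key=lambda b: density_growth(**boroughs[b]), reverse=True
)
```

In this made-up case the large borough gains more residents per year, but the small borough gains more residents per square kilometer, so the two rankings disagree.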

Now the earlier ranking returns: it is consistent with the ranking by rent growth.

Next, let's visualize the population growth.

Now let's use the density-based clustering algorithm DBSCAN to cluster the nearby venues. This clustering is based only on the locations of the venues, not on their types. The advantage of DBSCAN is that it is robust against outliers and you don't have to specify the number of clusters k, which is an important parameter in k-means clustering.
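To illustrate the idea, here is a small, dependency-free sketch of DBSCAN on (latitude, longitude) points with a great-circle distance. It is a teaching version, not the library implementation used for our results; the coordinates, the 200 m radius and the minimum of 3 points are all assumptions chosen for the example.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def dbscan(points, eps_m, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for outliers."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # All points within eps_m of point i, including i itself.
        return [j for j, q in enumerate(points) if haversine_m(points[i], q) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional outlier; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # outlier reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Made-up venue coordinates: two tight groups and one isolated point.
example_points = [
    (52.5200, 13.4000), (52.5203, 13.4004), (52.5198, 13.3997),
    (52.5000, 13.3500), (52.5003, 13.3504), (52.4998, 13.3497),
    (52.5500, 13.5000),
]
example_labels = dbscan(example_points, eps_m=200.0, min_samples=3)
```

Note that the number of clusters is never passed in: it follows from the radius and the minimum number of points, and the isolated point ends up with the label -1.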

Here are the labels. The venues with label "-1" are the outliers.

Let's now visualize the clusters on the maps.

The result is not surprising. Since the clustering is based on the locations of the venues, the centroids of the clusters are basically the centroids of the neighborhoods, with the exception of cluster 1 and cluster 6, where large neighborhoods are clustered together. But we can still read some information out of this clustering: clusters 7, 8, 9 and 10 contain very few venues. In Berlin, this simply means that these are forests. Living in a forest can also be expensive, but you would not expect the rent to rise sharply there, because a forest stays a forest unless there is a forest fire. Also, neighborhoods with outliers, i.e. those with venues labeled "-1", do not look promising, because in crowded areas the venues are usually clustered together.

Now let's cluster the venues based on their types. First of all, we should convert the string values to numeric values in order to perform the analysis.

Now we take the mean of these values for each neighborhood to see which kinds of venue are most common there.
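The two steps above (turning category names into numbers, then averaging per neighborhood) can be sketched as follows. The neighborhoods and venues are hypothetical stand-ins for the API results, and the one-hot encoding shown here is one common way to do the conversion, assumed for illustration.

```python
from collections import defaultdict

# Hypothetical (neighborhood, venue category) pairs, standing in for API results.
venues = [
    ("Neighborhood A", "Hotel"),
    ("Neighborhood A", "Hotel"),
    ("Neighborhood A", "Museum"),
    ("Neighborhood A", "Café"),
    ("Neighborhood B", "Café"),
    ("Neighborhood B", "Café"),
    ("Neighborhood B", "Park"),
]

categories = sorted({category for _, category in venues})


def one_hot(category):
    """Encode a category name as a 0/1 vector over all known categories."""
    return [1 if category == c else 0 for c in categories]


# Sum the one-hot rows per neighborhood and count the venues.
sums = defaultdict(lambda: [0] * len(categories))
counts = defaultdict(int)
for neighborhood, category in venues:
    row = one_hot(category)
    sums[neighborhood] = [s + r for s, r in zip(sums[neighborhood], row)]
    counts[neighborhood] += 1

# The mean of the one-hot rows is the share of each category in a neighborhood.
frequencies = {
    n: dict(zip(categories, (s / counts[n] for s in sums[n])))
    for n in sums
}

# The category with the largest share is the most common venue type.
most_common = {n: max(freq, key=freq.get) for n, freq in frequencies.items()}
```

Each neighborhood is now described by a numeric vector of category shares, which is the kind of input a clustering algorithm such as k-means can work with.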

Now we see that Berlin-Mitte-Mitte and Berlin-Mitte-Tiergarten are in the same cluster, cluster 0, and that Berlin-Friedrichshain and Berlin-Kreuzberg are both in cluster 5. Based on what we have observed so far, there are good reasons to believe that cluster 0 is the top tier, followed by cluster 5, and that cluster 1 is also competitive. Let's visualize the result.

Let's look into the pattern of the venues in each cluster.

We see that in cluster 0, hotels are the most common venue type and museums also appear, especially in Berlin-Mitte-Mitte. This tells us that these two neighborhoods lie at the heart of the city's tourist attractions, so a very high average rent is expected in each of them. We also see that Berlin-Mitte-Hansaviertel from cluster 1 has hotels as its most common venue type, although there are no museums.

It seems that the density of hotels is a good indicator of the average rent of an area. Let's look into that.

As expected, Berlin-Mitte has the most hotels in its vicinity. Although Berlin-Tiergarten and Berlin-Friedrichshain have the same number of hotels in their vicinities, considering the small size of Berlin-Tiergarten, hotels are obviously denser in Tiergarten than in Friedrichshain.

Let's visualize these observations.

Let's use DBSCAN again to cluster the hotels based on their locations.
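Once every hotel has a cluster label, each cluster can be summarized by the centroid of its members. The sketch below assumes the labels are already available as a list (with -1 for outliers); the points and labels are placeholders, and `cluster_centroids` is a hypothetical helper name.

```python
from collections import defaultdict


def cluster_centroids(points, labels):
    """Mean (lat, lon) of each cluster; points labeled -1 (outliers) are skipped."""
    grouped = defaultdict(list)
    for point, label in zip(points, labels):
        if label != -1:
            grouped[label].append(point)
    return {
        label: (
            sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members),
        )
        for label, members in grouped.items()
    }


# Placeholder hotel coordinates and cluster labels, for illustration only.
hotel_points = [(52.0, 13.0), (52.2, 13.4), (53.0, 14.0), (50.0, 10.0)]
hotel_labels = [0, 0, 1, -1]

centroids = cluster_centroids(hotel_points, hotel_labels)
```

Averaging latitude and longitude directly is a reasonable approximation at city scale, where the clusters span only a few kilometers.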

Now we can present to the stakeholders a ranked list of promising areas expected to have the highest rents in the next few years.

Let's create a map showing a heatmap (density) of hotels and try to extract some meaningful information from it. Let's also show the borders of the Berlin boroughs on our map, along with a few circles indicating distances of 1 km, 2 km and 3 km from the center of Berlin.
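The distance circles can also be expressed numerically: for each hotel we can compute its great-circle distance from the center and see which circle it falls into. This is a sketch under assumptions: the center coordinates are only an approximate guess, the hotel coordinates are placeholders, and `ring` is a hypothetical helper.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ring(distance_km, radii_km=(1, 2, 3)):
    """Smallest circle radius containing the distance, or None if outside all circles."""
    for radius in radii_km:
        if distance_km <= radius:
            return radius
    return None


# Assumed approximate center and placeholder hotel positions (not report data).
center = (52.5170, 13.3889)
hotels = [
    (52.5170, 13.3989),
    (52.5170, 13.4139),
    (52.5170, 13.4289),
    (52.5170, 13.4489),
]

rings = [ring(haversine_km(*center, *hotel)) for hotel in hotels]
```

Counting how many hotels fall into each circle gives a simple numeric counterpart to what the heatmap shows visually.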

This concludes our analysis.

Results and Discussion

We have found 5 center locations whose vicinities are expected to have the highest average rents in the next few years. Four of the zones are located in the Berlin borough of Mitte, and one belongs to the borough of Friedrichshain-Kreuzberg.

We can then evaluate our findings using the Wohnlagenkarte2021. This is a heat map of the living conditions of areas of Berlin: the darker the color, the better the living conditions of that area are supposed to be. When comparing the heat map with our findings, one has to keep in mind that an area with "good living conditions" does not have to be expensive. For example, a huge area southwest of the center of Berlin is marked as having "good living conditions" by the Berliner Mietspiegel. Most of this area belongs to the borough of Steglitz-Zehlendorf, which was part of the American sector before the fall of the Berlin Wall. However, as one can see, the average rent of this borough is below that of Berlin as a whole. Another very important point is that, since most of this borough is covered by forest, its living conditions, although considered good by some people, will not become "better" very quickly, so one does not expect a sharp increase in its average rent. This is borne out by our analysis of rent growth and population growth.

Charlottenburg-Wilmersdorf, in the former British sector, has some similarities with Steglitz-Zehlendorf, although it is much closer to the city center. Indeed, Kurfürstendamm, the avenue at the heart of Charlottenburg-Wilmersdorf, is one of the busiest shopping and business avenues of the metropolis, and it was once the business center of West Berlin. One similarity between Charlottenburg-Wilmersdorf and Steglitz-Zehlendorf is that both belonged to West Berlin before the fall of the Berlin Wall. Since the west of the city is already highly developed, the east seems to be developing faster than its western counterpart. In general, then, one expects the average rent in the east to grow faster than in the west. This is part of the reason why the average rent of the former business center of West Berlin, the area around Kurfürstendamm, grows more slowly than that of today's city center around Berlin-Alexanderplatz.

Since we were trying to find the neighborhoods and areas of Berlin that will have the highest average rents in the next few years, we omitted those boroughs whose average rents are below the average for Berlin as a whole. These include a huge borough, Berlin-Pankow, part of the former Soviet sector in the east of the city. However, this is not completely fair, because Pankow has a very central neighborhood, Prenzlauer Berg, which has been developing very fast in recent years. As one can see from the heat map of the Berliner Mietspiegel, Prenzlauer Berg is also a very promising area of Berlin. Although no area of this neighborhood is likely to compete with the top two on our list, some areas of Prenzlauer Berg may overtake the last two in our ranking.

Lastly, we want to remark that the simple linear regression model we used to estimate the growth of the average rent in each borough of Berlin can only be used for ranking purposes; it is not suitable for serious numerical estimation. There are many important factors that could influence the average rent, and we did not take any of them into account.

For example, on 23.02.2020 a new law, the Berliner Mietendeckel, came into force in our socialist state of Berlin. The law provides that the rent in every rental agreement must be frozen for at least 5 years, and that the rent in a new rental agreement should not exceed the old one agreed before 18.06.2019. This law could cause a headache for landlords in Berlin, because many tenants are paying very low rents that were agreed decades ago, and there is almost no way of ending such a contract unless the tenant stops paying. The Berliner Mietendeckel, together with the coronavirus pandemic, dragged down the growth of the average rent in Berlin in 2020, as we can see from our analysis. However, on 25.03.2021 the Federal Constitutional Court (Bundesverfassungsgericht) ruled that the Berliner Mietendeckel is unconstitutional and therefore void. This ruling, plus Germany's slow recovery from the pandemic, could cause the average rent in Berlin to grow sharply again. But there are many other factors to be considered as well, so the simple linear regression model is only valid under the assumption that the current political and economic situation remains stable.

Conclusion

The purpose of this project was to identify the areas of Berlin that will have the highest rents in the next few years. We started by studying the boroughs using data from the Investitionsbank Berlin (IBB). We first found that only three boroughs of Berlin, namely Mitte, Friedrichshain-Kreuzberg and Charlottenburg-Wilmersdorf, have above-average rents, so we narrowed our focus to these three. Then we analysed their rent growth and population growth. We concluded that Mitte has the highest average rent and the fastest rent, population and population-density growth; Friedrichshain-Kreuzberg has the second-highest average rent and the second-fastest rent and population-density growth; and Charlottenburg-Wilmersdorf has the third-highest average rent, the second-fastest population growth, and the third-fastest rent and population-density growth.

Next, we looked into the boroughs to find out which neighborhoods and regions should have the highest average rents. To do this, we used the Foursquare API to examine the patterns of venues in those neighborhoods, on the assumption that neighborhoods with similar venue patterns should be in the same rent class. By analysing the patterns we found that the density of nearby hotels is a good indicator of the average rent level. We then used density-based clustering (DBSCAN) to cluster the nearby hotels of the neighborhoods in the three boroughs. By taking the centroid of each cluster, we produced a ranked list of recommended Berlin areas for further analysis. These are:

  1. Location (52.52292197640304, 13.404153473698498) in Berlin-Mitte near Berlin-Alexanderplatz;
  2. Location (52.52292197640304, 13.404153473698498) in Berlin-Mitte near Gendarmenmarkt;
  3. Location (52.505297478714574, 13.35403627597626) in Berlin-Tiergarten near Kurfürstenstraße;
  4. Location (52.51138271709561, 13.452391736530407) in Berlin-Friedrichshain near Frankfurter Tor;
  5. Location (52.51138271709561, 13.452391736530407) in Berlin-Hansaviertel and Berlin-Moabit near Turmstraße.

Needless to say, No. 1 and No. 2, which are located at the center of the Berlin metropolis, are expected to have the highest average rents in the next few years and to grow very fast. But property prices there are also expected to be very high.

No. 3, which is located on the west side of the center of Berlin, bridges the west and the east of the city. Given its central position, one can also expect a very solid rent growth rate. Property prices there would also be very high.

No. 4, located in the east of the city, is one of the fastest-developing regions of Berlin since the fall of the Berlin Wall. For stakeholders who are short of capital, this is surely a wonderful area for investment, with a very favorable cost/performance ratio.

No. 5 is located in the west of the city. The southern part of this area has very attractive living conditions, and it is well known as a multicultural neighborhood of Berlin. It is definitely worth investing in.